Abstract. This paper presents SunPy (version 0.5), a community-developed Python
package for solar physics. Python, a free, cross-platform, general-purpose,
high-level programming language, has seen widespread adoption among the scientific
community, resulting in the availability of a large number of software packages,
from numerical computation (NumPy, SciPy) and machine learning (scikit-learn)
to visualisation and plotting (matplotlib). SunPy is a data-analysis environment
specialising in providing the software necessary to analyse solar and heliospheric
data in Python. SunPy is open-source software (BSD licence) and has an open and
transparent development workflow that anyone can contribute to. SunPy provides
access to solar data through integration with the Virtual Solar Observatory (VSO),
the Heliophysics Event Knowledgebase (HEK), and the HELiophysics Integrated
Observatory (HELIO) webservices. It currently supports image data from major solar
missions (e.g., SDO, SOHO, STEREO, and IRIS), time-series data from missions such
as GOES, SDO/EVE, and PROBA2/LYRA, and radio spectra from e-Callisto and
STEREO/SWAVES. We describe SunPy’s functionality, provide examples of solar
data analysis in SunPy, and show how Python-based solar data-analysis can leverage
the many existing tools already available in Python. We discuss the future goals of
the project and encourage interested users to become involved in the planning and
development of SunPy.

1. Introduction

Science is driven by the analysis of data of ever-growing variety and complexity.
Advances in sensor technology, combined with the availability of inexpensive storage,
have led to rapid increases in the amount of data available to scientists in almost every
discipline. Solar physics is no exception to this trend. For example, NASA’s Solar
Dynamics Observatory (SDO) spacecraft, launched in February 2010, produces over 1
TB of data per day (Pesnell et al. 2012). Managing and analysing these data requires
increasingly sophisticated software tools. These tools should be robust, easy to use
and modify, have a transparent development history, and conform to modern
software-engineering standards. Software with these qualities provides a strong
foundation that can support the needs of the community as data volumes grow and
science questions evolve.

The SunPy project aims to provide a software package with these qualities for the
analysis and visualisation of solar data. SunPy makes use of Python and scientific
Python packages. Python is a free, general-purpose, powerful, and easy-to-learn
high-level programming language. Additionally, Python is widely used outside of
scientific fields in areas such as ‘big data’ analytics, web development, and educational
environments. For example, pandas (McKinney 2010, 2012) was originally developed
for quantitative analysis of financial data and has since grown into a generalised
time-series data-analysis package. Python continues to see increased use in the astronomy
community (Greenfield, 2011), whose goals and requirements are similar to those of the
solar physics community. Finally, Python integrates well with many technologies such as
web servers (Dolgert et al., 2008) and databases.

The development of a package such as SunPy is made possible by the rich ecosystem
of scientific packages available in Python. Core packages such as NumPy, SciPy (Jones
et al., 2001), and matplotlib (Hunter, 2007) provide the basic functionality expected
of a scientific programming language, such as array manipulation, core numerical
algorithms, and visualisation, respectively. Building upon these foundations, packages
such as astropy (astronomy; Astropy Collaboration et al., 2013), pandas (time-series;
McKinney 2012), and scikit-image (image processing; van der Walt et al. 2014)
provide more domain-specific functionality.

A typical workflow begins with a solar physicist manually identifying a small
number of events of interest on the Sun. This is typically done in order to investigate in
detail the physics of these events (for example, the large solar flare of 23 July 2002 has
Astrophysical Journal Letters volume 595 dedicated to its analysis). In this workflow,
an event is investigated in depth, which requires data from many different instruments.
These data are typically provided in many different formats - for example, FITS (Flexible
Image Transport System, Pence et al., 2010), CSV, or binary files - and contain many
different types of data (such as images, lightcurves and spectra). In addition, the
repositories these data reside in can have different access methods. This workflow is
characterized by the large number of heterogeneous datasets used in the investigation
of a small number of solar events.

Another typical workflow begins with the solar physicist identifying a large sample
of data or events. The goal here is to obtain information about the population in general.
An example might be to calculate the fractal dimension of a large number of active region
magnetic fields (McAteer et al. 2005), or to calculate the observed temperatures in a
population of solar flares (Ryan et al. 2012). This workflow is typically characterized
by lower data heterogeneity, but with a larger number of files.

The volume and variety of solar data used in these workflows drive the need for
an environment in which obtaining and performing common solar physics operations on
these data is as simple and intuitive as possible. SunPy is designed to be a clean,
simple-to-use, and well-structured open-source package that provides the core tools for
solar data analysis, motivated by the need for a free and modern alternative to the existing
SolarSoft (SSW) library (Freeland and Handy, 1998). While SSW is open source and
freely available, it relies on IDL (Interactive Data Language), a proprietary data-analysis
environment.

The purpose of this paper is to provide an overview of SunPy’s current capabilities, 
an overview of the project’s development model, community aspects of the project, 
and future plans. The latest release of SunPy, version 0.5, can be downloaded
from http://sunpy.org or can be installed using the Python package index
(http://pypi.python.org/pypi).


2. Core Data Types 

The core of SunPy is a set of data structures that are specifically designed for the three 
primary varieties of solar physics data: images, time series, and spectra. These core 
data types are supported by the SunPy classes: Map (2D spatial data), LightCurve (1D
temporal series), and Spectrum and Spectrogram (1D and 2D spectra). The purpose of
these classes is to provide the same core data type to the SunPy user regardless of the 
differences in source data. For example, if two different instruments use different time 
formats to describe the observation time of their images, the corresponding SunPy Map 
object for each of them expresses the observation time in the same way. This simplifies 
the workflow for the user when handling data from multiple sources. 
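
The idea of exposing one observation-time representation regardless of the source format can be illustrated with a minimal sketch. The format list, the header keys (`date-obs`, `date_obs`), and the `parse_obs_time` helper are assumptions made for this illustration; they are not SunPy's implementation.

```python
from datetime import datetime

# Hypothetical time formats that two different instruments might use.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f",   # e.g. 2011-06-07T06:33:02.77
    "%Y-%m-%dT%H:%M:%S",
    "%Y/%m/%d %H:%M:%S",      # e.g. 2011/06/07 06:33:02
)


def parse_obs_time(value):
    """Return a datetime for a string in any of the known formats."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised time format: %r" % value)


# Two invented headers that describe the observation time differently.
header_a = {"date-obs": "2011-06-07T06:33:02.77"}
header_b = {"date_obs": "2011/06/07 06:33:02"}

time_a = parse_obs_time(header_a["date-obs"])
time_b = parse_obs_time(header_b["date_obs"])
```

Both results are ordinary `datetime` objects, so code that consumes them does not need to know which instrument produced the header.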

These classes allow access to the data and associated metadata and provide
appropriate convenience functions to enable analysis and visualisation. For each of these
classes, the data is stored in the data attribute, while the metadata is stored in the meta
attribute.† It is possible to instantiate the data types from various different sources:
e.g., files, URLs, and arrays. In order to provide instrument-specific specialisation, the
core SunPy classes make use of subclassing; e.g., Map has an AIAMap sub-type for data
from the SDO/AIA (Atmospheric Imaging Assembly; Lemen et al. 2012) instrument.

† Note that currently only Map and LightCurve have this feature fully implemented.

All of the core SunPy data types include visualisation methods that are tailored to
each data type. These visualisation methods all utilise the matplotlib package and are
designed in such a way that they integrate well with the pyplot functional interface of
matplotlib.

This design philosophy makes the behaviour of SunPy’s visualisation routines 
intuitive to those who already understand the matplotlib interface, as well as allowing 
the use of the standard matplotlib commands to manipulate the plot parameters (e.g., 
title, axes). Data visualisation is provided by two functions: peek(), for quick plotting, 
and plot(), for plotting with more fine-grained control. 

This section will give a brief overview of the current functionality of each of the 
core SunPy data types. 


2.1. Map 


The Map data type stores 2D spatial data, such as images of the Sun and inner
heliosphere. It provides a wrapper around a numpy data array, the image’s associated
spatial coordinates, and other metadata. The Map class provides methods for typical
operations on 2D data, such as rotation and re-sampling, as well as visualisation. The
Map class also provides a convenient interface for loading data from a variety of sources,
including from FITS files, the standard format for storing image data in the solar physics
and astrophysics communities. An example of creating a Map object from a FITS file is
shown in Listing 1.

The architecture of the map subpackage consists of a template map called 
GenericMap, which is a subclass of astropy.nddata.NDData. NDData is a generic 
wrapper around a numpy.ndarray with a meta attribute to store metadata. As NDData is
currently still in development, GenericMap does not yet make full use of its capabilities, 
but this inheritance structure provides for future integration with astropy. In order 
to provide instrument- or detector-specific integration, GenericMap is designed to be 
subclassed. Each subclass of GenericMap can register with the Map creation factory, 
which will then automatically return an instance of the specific GenericMap subclass 
dependent upon the data provided. SunPy v0.5 has GenericMap specialisations for the 
following instruments: 


Yohkoh Solar X-ray Telescope (SXT; Ogawara et al. 1991; Tsuneta et al. 1991),

Solar and Heliospheric Observatory (SOHO; Domingo et al. 1995) Extreme
Ultraviolet Telescope (EIT; Delaboudiniere et al. 1995),

SOHO Large Angle Spectroscopic COronagraph (LASCO; Brueckner et al. 1995),

RHESSI - Reuven Ramaty High Energy Solar Spectroscopic Imager (Lin et al. 2002),

Solar TErrestrial RElations Observatory (STEREO; Kaiser 2005) Extreme
Ultraviolet Imager (EUVI; Wuelser et al. 2004),

STEREO CORonagraph 1/2 (COR 1/2; Howard et al. 2002),

Hinode XRT - X-Ray Telescope (Kosugi et al. 2007; Golub et al. 2007),

PRojects for On Board Autonomy 2 (PROBA2; Santandrea et al. 2013) Sun
Watcher Active Pixel (SWAP; Seaton et al. 2013),

SDO AIA and Helioseismic Magnetic Imager (HMI; Scherrer et al. 2012),

Interface Region Imaging Spectrograph (IRIS; Lemen et al. 2011) SJI (slit-jaw
imager) frames.
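
The registration of instrument-specific subclasses with a creation factory, described before the list above, can be sketched in a few lines. This is a simplified illustration under our own assumptions: the class names ending in `Sketch`, the `register` decorator, the `is_datasource_for` check, and `make_map` are hypothetical stand-ins and do not reproduce the SunPy source.

```python
class GenericMapSketch(object):
    """Stand-in for a generic map: holds data and a metadata dict."""
    registry = []   # subclasses that have registered with the factory

    def __init__(self, data, meta):
        self.data = data
        self.meta = meta

    @classmethod
    def is_datasource_for(cls, meta):
        # The generic class never claims data; it is only the fallback.
        return False


def register(cls):
    """Class decorator that adds a subclass to the factory registry."""
    GenericMapSketch.registry.append(cls)
    return cls


@register
class AIAMapSketch(GenericMapSketch):
    @classmethod
    def is_datasource_for(cls, meta):
        # Claim the data when the header names an AIA instrument.
        return meta.get("instrume", "").upper().startswith("AIA")


def make_map(data, meta):
    """Return the first registered subclass that claims the metadata."""
    for candidate in GenericMapSketch.registry:
        if candidate.is_datasource_for(meta):
            return candidate(data, meta)
    return GenericMapSketch(data, meta)


aia = make_map([[0, 1], [2, 3]], {"instrume": "AIA_3"})
other = make_map([[0, 1], [2, 3]], {"instrume": "UNKNOWN"})
```

With this pattern, supporting a new instrument only requires defining and registering one more subclass; the factory itself does not change.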


The GenericMap class stores all of the metadata retrieved from the header
of the image file in the meta attribute and provides convenience properties for
commonly accessed metadata: e.g., instrument, wavelength or coordinate_system.
These properties are dynamic mappings to the underlying metadata and all methods
of the GenericMap class modify the metadata where needed. For example, if
aiamap.meta['instrume'] is modified then aiamap.instrument will reflect this
change. Currently this is implemented by not preserving the keywords of the input
data, instead modifying the metadata to a set of “standard” keys supported by SunPy.
Listing 1 demonstrates the quick-look functionality of Map.

>>> import sunpy.map
>>> aiamap = sunpy.map.Map(sunpy.AIA_171_IMAGE)
>>> smap = aiamap.submap([-1200, -200], [-1000, 0])
>>> smap.peek(draw_grid=True)

[Figure: output of smap.peek(); x-axis: X-position [arcsec]]

Listing 1: Example of the AIAMap specialisation of GenericMap. First, a map is created
from a sample SDO/AIA FITS file. In this case, a demonstration file contained within
the SunPy repository is used. A cutout of the full map is then created by specifying
the desired solar-x and solar-y ranges of the plot in data coordinates (in this case,
arcseconds), and then a quick-view plot is created with lines of heliographic longitude
and latitude over-plotted.
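
The dynamic mapping between convenience properties and the meta dictionary, described before Listing 1, can be sketched with a Python property. `MapSketch` is a hypothetical stand-in written for this illustration, and the header values are invented.

```python
class MapSketch(object):
    """Minimal stand-in exposing metadata through a read-only property."""

    def __init__(self, meta):
        self.meta = dict(meta)

    @property
    def instrument(self):
        # Looked up on every access, so edits to ``meta`` are visible at once.
        return self.meta.get("instrume")


aiamap = MapSketch({"instrume": "AIA_3", "wavelnth": 171})
before = aiamap.instrument
aiamap.meta["instrume"] = "AIA_4"   # modify the underlying metadata
after = aiamap.instrument           # the property reflects the change
```

Because the property stores no copy of the value, the metadata dictionary remains the single source of truth.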

In addition to the data-type classes, the map subpackage provides two collection 
classes, CompositeMap and MapCube, for spatially and temporally aligned data 
respectively. CompositeMap provides methods for overlaying spatially aligned data, with 
support for visualisation of images and contour lines overlaid upon each other. MapCube 
provides methods for animation of its series of Map objects. Listings 2 and 3 show how
to interact with these classes. 

>>> import sunpy.map
>>> import matplotlib.pyplot as plt
>>> compmap = sunpy.map.Map("aia_1600_image.fits", "RHESSI_image.fits",
...                         composite=True)
>>> compmap.set_levels(1, range(0, 50, 5), percent=True)
>>> compmap.set_colors(1, "Reds_r")
# Plot the result and crop
>>> ax = plt.subplot()
>>> compmap.plot()
>>> ax.axis([200, 600, -600, -200])
>>> plt.show()

[Figure: SunPy Composite Plot; x-axis: X-position [arcsec]]

Listing 2: Example showing the functionality of CompositeMap, with RHESSI X-ray
image data composited on top of an SDO/AIA 1600 Å image. The CompositeMap is
plotted using the integration with the matplotlib.pyplot interface.

>>> import sunpy.map
>>> cubemap = sunpy.map.Map("aia_lev1_171a_2014_01*fits", cube=True)
>>> cubemap.peek()

[Figure: SDO 171 2014-01-02T13:00:11.35; x-axis: X-position [arcsec]]

Listing 3: Example showing the creation of a MapCube from a list of AIA image files. The
resultant plot makes use of matplotlib's interactive widgets to allow scrolling through
the MapCube.


2.2. Lightcurve 


Time series data and their analyses are a fundamental part of solar physics for 
which many data sources are available. SunPy provides a LightCurve class with a 
convenient and consistent interface for handling solar time-series data. The main engine 
behind the LightCurve class is the pandas data analysis library. LightCurve’s data 
attribute is a pandas.DataFrame object. The pandas library contains a large amount of
functionality for manipulating and analysing time-series data, making it an ideal basis 
for LightCurve. LightCurve assumes that the input data are time-ordered list(s) of 
numbers, and each list becomes a column in the pandas DataFrame object. 

Currently, the LightCurve class is compatible with the following data sources:
the Geostationary Operational Environmental Satellite (GOES) X-ray Sensor (XRS),
the Nobeyama Radioheliograph (NoRH), PROBA2 Large Yield Radiometer (LYRA,
Dominique et al. 2013), RHESSI, and SDO EUV Variability Experiment† (EVE, Woods
et al. 2012). LightCurve also supports a number of solar summary indices - such as
average sunspot number - that are provided by the National Oceanic and Atmospheric
Administration (NOAA). For each of these sources, a subclass of the LightCurve
object is initialised (e.g., GOESLightCurve) which inherits from LightCurve, but allows
instrument-specific functionality to be included. Future developments will introduce
support for additional instruments and data products, as well as implementing an
interface similar to that of Map. Since there is no established standard as to how
time-series data should be stored and distributed, each SunPy LightCurve object subclass
provides the ability to download its corresponding specific data format in its constructor
and parse that file type. A more general download interface is currently in development.

† Note that only the level “0CS” and average CSV files are currently implemented - see
http://lasp.colorado.edu/home/eve/data/

A LightCurve object may be created using a number of different methods. For
example, a LightCurve may be created for a specific instrument given an input time
range. In Listing 4, the LightCurve constructor searches a remote source for the GOES
X-ray data specified by the time interval, downloads the required files, and subsequently
creates and plots the object. Alternatively, if the data file already exists on the local
system, the LightCurve object may be initialised using that file as input.

>>> from sunpy import lightcurve
>>> from sunpy.time import TimeRange
>>> tr = TimeRange("2011-06-07 06:00", "2011-06-07 08:00")
>>> goes = lightcurve.GOESLightCurve.create(tr)
>>> goes.peek()
>>> print('The max flux is ' + str(goes.data['xrsb'].max()) +
...       ' at ' + str(goes.data['xrsb'].idxmax()))
The max flux is 2.5554e-05 at 2011-06-07 06:41:24.118999

Listing 4: Example retrieval of a GOES lightcurve using a time range and the output of
the peek() method. The maximum flux value in the GOES 1.0-8.0 Å channel is then
retrieved along with the location in time of the maximum.
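
The peak-finding step in Listing 4 relies on the pandas max() and idxmax() methods. The same logic can be sketched in plain Python; the time stamps and flux values below are invented for illustration and are not GOES measurements.

```python
from datetime import datetime, timedelta

# Invented, time-ordered sample: one flux value every ten minutes.
start = datetime(2011, 6, 7, 6, 0)
times = [start + timedelta(minutes=10 * i) for i in range(6)]
flux = [1.0e-6, 4.0e-6, 2.5e-5, 9.0e-6, 3.0e-6, 2.0e-6]

# Index of the largest flux value, then the matching time stamp.
peak_index = max(range(len(flux)), key=flux.__getitem__)
peak_flux = flux[peak_index]
peak_time = times[peak_index]
```

Keeping the values in a time-indexed structure, as LightCurve does with a DataFrame, means the time of the maximum comes from the same lookup as the maximum itself.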


2.3. Spectra


SunPy aims to provide broad support for solar spectroscopy instruments. The variety
and complexity of these instruments and their resultant datasets makes this a challenging
goal. The spectra module implements a Spectrum class for 1D data (intensity as a
function of frequency) and a Spectrogram class for 2D data (intensity as a function of
time and frequency). Each of these classes uses a numpy.ndarray object as its data
attribute.

As with other SunPy data types, the Spectrogram class has been built so 
that each instrument initialises using a subclass containing the instrument-specific 
functionalities. The common functionality provided by the base Spectrogram class 
includes joining different time ranges and frequencies, performing frequency-dependent 
background subtraction, and convenient visualisation and sampling of the data.
Currently, the Spectrogram class supports radio spectrograms from the e-Callisto
(http://www.e-callisto.org/) solar radio spectrometer network (Benz et al., 2009)
and STEREO/SWAVES spectrograms (Bougeret et al., 2008).

Listing 5 shows how the CallistoSpectrogram object retrieves spectrogram data in
the time range specified. When the data is requested using the from_range() function,
the object merges all the downloaded files into a single spectrogram, across time and
frequency. In the example shown, data is provided in two frequency ranges:
20-90 MHz and 55-355 MHz. Since the data are not evenly spaced in the frequency range,
the Spectrogram object linearises the frequency axis to assist analysis. The example
also demonstrates the implemented background subtraction method, which calculates a
constant background over time for each frequency channel.

>>> from sunpy.spectra.sources.callisto import CallistoSpectrogram
>>> tstart, tend = "2011-06-07T06:00:00", "2011-06-07T07:45:00"
>>> callisto = CallistoSpectrogram.from_range("BIR", tstart, tend)
>>> callisto_nobg = callisto.subtract_bg()
>>> callisto_nobg.peek(vmin=0)

[Figure: 07 Jun 2011 Radio flux density (BIR); x-axis: Time [UT]]

Listing 5: Example of how CallistoSpectrogram retrieves the data for the requested
time range and observatory, merges it, and removes the background signal. The
data requested - ‘BIR’ - is the code name of the Rosse Observatory
(http://www.rosseobservatory.ie) at Birr Castle in Ireland.
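
The background-subtraction step can be sketched as follows. The text states only that a constant background over time is calculated for each frequency channel; using the per-channel mean as that constant is an assumption made here for simplicity, and `subtract_constant_background` is a hypothetical helper rather than the SunPy method.

```python
def subtract_constant_background(spectrogram):
    """Subtract one constant per frequency channel (row).

    ``spectrogram`` is a list of rows; each row holds the intensities of
    one frequency channel over time. The constant used here is the row
    mean, which is an assumption of this sketch.
    """
    result = []
    for channel in spectrogram:
        background = sum(channel) / float(len(channel))
        result.append([value - background for value in channel])
    return result


# Invented data: two frequency channels, four time steps each.
data = [
    [10.0, 10.0, 16.0, 10.0],   # channel with a short burst
    [5.0, 5.0, 5.0, 5.0],       # quiet channel
]
cleaned = subtract_constant_background(data)
```

A mean is pulled upwards by the burst itself, so a more robust constant (for example, one estimated from quiet intervals only) may be preferable in practice.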


3. Solar Data Search and Retrieval 

Several well-developed resources currently exist which provide remote access to, and data
retrieval from, a large number of solar and heliospheric data sources and event databases.
SunPy provides support for these resources via the net subpackage. In the following 
subsections, we describe each of these resources and how to use them. 


3.1. VSO 


The Virtual Solar Observatory (VSO, http://virtualsolar.org) provides a single,
standard query interface to solar data from many different archives around the world
(Hill et al. 2009). Data products can be requested for specific instruments or missions
and can also be requested based on physical parameters of the data product such as
the wavelength range. In addition to the VSO’s primary web-based interface, a SOAP
(Simple Object Access Protocol) service is also available. SunPy’s vso module provides
access to the VSO via this SOAP service using the suds package.

Listing [6] shows an example of how to query and download data from the VSO 
using the vso module. Queries are constructed using one or more attribute objects. 
Each attribute object is a constraint on a parameter of the data set, such as the time of 
the observation, instrument, or wavelength. Listing [6] also shows how to download the 
data using the constructed query. The path to which the data files will be downloaded is
defined using custom tokens which reference the file metadata (e.g., instrument, detector,
filename). This provides users the ability to organize their data into subdirectories on 
download. 

Listing 7 shows an example of how to make an advanced query by combining
attribute objects. Two attribute objects can be combined with a logical or operation
using the | (pipe) operator. All attribute objects provided to the query as arguments
are combined with a logical and operation.

>>> from sunpy.net import vso
>>> client = vso.VSOClient()
>>> tstart, tend = "2011/6/7 05:30", "2011/6/7 10:30"
>>> lasco_query = client.query(vso.attrs.Time(tstart, tend),
...                            vso.attrs.Instrument('lasco'))
>>> len(lasco_query)
40
>>> lasco_query.show()
Start time           End time             Source  Instrument  Type
2011-06-07 05:35:23  2011-06-07 05:35:48  SOHO    LASCO       CORONA
2011-06-07 05:43:09  2011-06-07 05:43:29  SOHO    LASCO       CORONA
>>> pathformat = "/data/{instrument}/{detector}/{file}.fits"
>>> results = client.get(lasco_query, path=pathformat)

Listing 6: Example of querying a single instrument over a time range and downloading
the data.
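
The token-based download path in Listing 6 can be understood as a template filled in from each file's metadata. The sketch below uses Python's built-in string formatting to show the idea; the metadata record and file name are invented, and the way SunPy expands the tokens internally is not shown here.

```python
# Metadata of one hypothetical query result (values invented).
record = {"instrument": "LASCO", "detector": "C2", "file": "22372915"}

# Tokens in braces are replaced by the matching metadata values.
pathformat = "/data/{instrument}/{detector}/{file}.fits"
path = pathformat.format(**record)
```

Because the directory names come from the metadata, files from different instruments and detectors are sorted into separate subdirectories automatically.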

>>> condition = (vso.attrs.Detector("cor1") |
...              vso.attrs.Wave(125, 135) |
...              vso.attrs.Wave(165, 175))  # in angstroms
>>> advanced = client.query(vso.attrs.Time(tstart, tend), condition)
>>> len(advanced)
4434
>>> advanced.show()
Start time           End time             Source    Instrument
2011-06-07 00:00:00  2011-06-08 00:00:00  SDO       EVE
2011-06-07 05:31:09  2011-06-07 05:31:19  PROBA2    SWAP
2011-06-07 10:25:43  2011-06-07 10:25:45  STEREO_B  SECCHI
2011-06-07 10:30:00  2011-06-07 10:30:01  STEREO_B  SECCHI
2011-06-07 10:30:00  2011-06-07 10:30:01  SDO       AIA

Listing 7: Example of an advanced VSO query using attribute objects, combining both
data from a detector and any data that falls within two wavelength ranges, continuing
from Listing 6.
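
The way attribute objects combine, with | giving a logical or and multiple query arguments giving a logical and, can be sketched with operator overloading. The classes below are hypothetical and filter an in-memory list of invented records; the real vso module builds a remote request instead, which this sketch does not attempt to reproduce.

```python
class Attr(object):
    """Base class: an attribute is a predicate over a record (a dict)."""

    def __or__(self, other):
        # ``a | b`` builds a new attribute that matches if either does.
        return AttrOr(self, other)


class Instrument(Attr):
    def __init__(self, name):
        self.name = name

    def matches(self, record):
        return record["instrument"].lower() == self.name.lower()


class Wave(Attr):
    def __init__(self, wave_min, wave_max):
        self.wave_min, self.wave_max = wave_min, wave_max

    def matches(self, record):
        return self.wave_min <= record["wavelength"] <= self.wave_max


class AttrOr(Attr):
    def __init__(self, *attrs):
        self.attrs = attrs

    def matches(self, record):
        return any(attr.matches(record) for attr in self.attrs)


def query(records, *attrs):
    """Keep the records that satisfy every attribute (logical and)."""
    return [r for r in records if all(a.matches(r) for a in attrs)]


records = [
    {"instrument": "AIA", "wavelength": 171},
    {"instrument": "AIA", "wavelength": 304},
    {"instrument": "EIT", "wavelength": 171},
]
hits = query(records, Instrument("aia"), Wave(125, 135) | Wave(165, 175))
```

Only the first record is selected: it is the only one from AIA that also falls in one of the two wavelength ranges.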


3.2. HEK 


The Sun is an active star and exhibits a wide range of transient phenomena (e.g., flares,
radio bursts, coronal mass ejections) at many different time-scales, length-scales, and
wavelengths. Observations and metadata concerning these phenomena are collected
in the Heliophysics Event Knowledgebase (HEK, Hurlburt et al., 2012). Entries are
generated both by automated algorithms and human observers. Some of the information
in the HEK reproduces feature and event data from elsewhere (for example, the GOES
flare catalogue), and some is generated by the Solar Dynamics Observatory Feature
Finding Team (Martens et al. 2012). A key feature of the HEK is that it provides
a homogeneous and well-described interface to a large amount of feature and event
information. SunPy accesses this information through the hek module. The hek module
makes use of the HEK public API.†

Simple HEK queries consist of a start time, an end time, and an event type (see
Listing 8). Event types are specified as upper case, two letter strings, and these strings
are identical to the two letter abbreviations defined by HEK (see
http://www.lmsal.com/hek/VOEvent_Spec.html). Users can see a complete list and
description of these abbreviations by looking at the documentation for
hek.attrs.EventType.

† For more information see
http://vso.stanford.edu/hekwiki/ApplicationProgrammingInterface

>>> from sunpy.net import hek
>>> client = hek.HEKClient()
>>> tstart, tend = "2011/08/09 00:00:00", "2011/08/10 00:00:00"
>>> result = client.query(hek.attrs.Time(tstart, tend),
...                       hek.attrs.EventType("FL"))  # FL = flare
>>> len(result)
52

Listing 8: Example usage of the hek module showing a simple HEK search for solar flares
on 2011 August 9.


Short-cuts are also provided for some often-used event types. For example, the flare
attribute can be declared as either hek.attrs.EventType("FL") or as hek.attrs.FL.
HEK attributes differ from VSO attributes (Section 3.1) in that many of them are
wrappers that conveniently expose comparisons by overloading Python operators. This
allows filtering of the HEK entries by the properties of the event. As was mentioned
above, the HEK stores feature and event metadata obtained in different ways, known
generally as feature recognition methods (FRMs). The example in Listing 9 repeats the
previous HEK query (see Listing 8), with an additional filter enabled to return only
those events that have the FRM ‘SSW Latest Events’. Multiple comparisons can be
made by including more comma-separated conditions on the attributes in the call to the
HEK query method.

>>> result = client.query(hek.attrs.Time(tstart, tend),
...                       hek.attrs.EventType("FL"),
...                       hek.attrs.FRM.Name == "SSW Latest Events")
>>> len(result)
9

Listing 9: An HEK query that returns only those flares that were detected by the ‘SSW
Latest Events’ feature recognition method.
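
The comparison syntax used in Listing 9 works because the attribute wrappers overload Python's comparison operators, so that a comparison yields a filter term rather than a boolean. A minimal sketch of this technique follows; the `Field` class, the tuple representation of a term, and the field names are assumptions made for illustration and do not mirror the hek module's internals.

```python
class Field(object):
    """A named event property whose comparisons build filter terms."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        # Return a description of the comparison instead of True/False.
        return (self.name, "=", value)

    def __gt__(self, value):
        return (self.name, ">", value)

    def __lt__(self, value):
        return (self.name, "<", value)


# Hypothetical field names chosen for this sketch.
frm_name = Field("frm_name")
peak_flux = Field("fl_peakflux")

term_a = frm_name == "SSW Latest Events"
term_b = peak_flux > 1000.0
```

Each term records which property is compared, how, and against what value, which is the information a client needs in order to express the filter in a remote query.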


HEK comparisons can be combined using Python’s logical operators (e.g., and and
or). The ability to use comparison and logical operators on HEK attributes allows the
construction of queries of arbitrary complexity. For example, the query in Listing 10
returns flares with helio-projective x-coordinates west of 50 arcseconds or those that
have a peak flux above 1000.0 (in units defined by the FRM).

>>> result = client.query(hek.attrs.Time(tstart, tend),
...                       hek.attrs.EventType("FL"),
...                       (hek.attrs.Event.Coord1 > 50)
...                       or (hek.attrs.FL.PeakFlux > 1000.0))

Listing 10: HEK query using the logical or operator.

All FRMs report their required feature attributes (as defined by the HEK), but the
optional attributes are FRM dependent.† If an FRM does not have one of the optional
attributes, None is returned by the hek module.

After users have found events of interest, the next step is to download observational
data. The H2VClient class of the hek2vso module makes this easier by providing a
translation layer between HEK query results and VSO data queries. This capability is
demonstrated in Listing 11.

>>> from sunpy.net import hek2vso
>>> h2v = hek2vso.H2VClient()
>>> vso_results = h2v.translate_and_query(result[0])
>>> h2v.vso_client.get(vso_results[0]).wait()

Listing 11: Code snippet continuing from Listing 10 showing the query and download of
data from the first HEK result from the VSO.
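
The translation layer can be pictured as a function that maps the fields of one event record onto the constraints of a data search. The sketch below is an illustration under stated assumptions: the `translate` helper, the key names of the event dictionary, and the sample values are invented here and are not taken from the hek2vso source.

```python
def translate(event):
    """Map one event record to keyword constraints for a data search."""
    return {
        "time": (event["event_starttime"], event["event_endtime"]),
        "source": event.get("obs_observatory"),
        "instrument": event.get("obs_instrument"),
    }


# Invented event record used only to exercise the function.
event = {
    "event_starttime": "2011-08-09T07:48:00",
    "event_endtime": "2011-08-09T08:08:00",
    "obs_observatory": "GOES",
    "obs_instrument": "GOES",
}
constraints = translate(event)
```

Fields that an event does not provide come back as None, so a caller can drop those constraints before building the data query.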


3.3. HELIO 


The HELiophysics Integrated Observatory (HELIO)‡ has compiled a list of web services
which allow scientists to query and discover data throughout the heliosphere, from
solar and magnetospheric data to planetary and inter-planetary data (Perez-Suarez
et al., 2012). HELIO is built with a Service-Oriented Architecture, i.e., its capabilities
are divided into a number of tasks that are implemented as separate services. HELIO
is made up of nine different public services, which allow scientists to search different
catalogues of registered events, solar features, data from instruments in the heliosphere,
and other information such as planetary or spacecraft position in time. Additionally,
HELIO provides a service that uses a propagation model to link the data in different
points of the solar system by its original nature (e.g., Earth auroras are a signature of
magnetic field disturbances produced a few days before on the Sun). In addition to the
primary, web-based interface to HELIO, its services are available via an API.

SunPy’s hec module provides an interface to the HELIO Event Catalogue (HEC)
service. This module was developed as part of a Google Summer of Code (GSOC)
project in 2013. The HEC service currently provides access to 84 catalogues from
different sources. As with all of the HELIO services, the HEC service provides results
in VOTable data format (defined by IVOA, see Ochsenbein et al. 2011). The hec
module parses this output using the astropy.io.votable package. This format has
the advantage of containing metadata with information like data provenance and the
performed query.

† See http://www.lmsal.com/hek/VOEvent_Spec.html for a list of features and their attributes.
‡ For more information see http://helio-vo.eu

For example, Listing 12 shows how to obtain information from different catalogues
of coronal mass ejections (CMEs).

>>> from sunpy.net.helio import hec
>>> hc = hec.HECClient()
>>> tstart, tend = "2011-06-07T06:00:00", "2011-06-07T12:00:00"
>>> event_type = "cme"
# From all the catalogues which contain our event type of interest
>>> catalogues = hc.get_table_names()
>>> catalogues_event = [l[0] for l in catalogues
...                     if event_type in l[0] and 'list' not in l[0]]
# Query all the catalogues that come from cactus
>>> results = [hc.time_query(tstart, tend, event)
...            for event in catalogues_event if 'cactus' in event]
>>> for cat in results:
...     print("{cat} has {nres} results".format(cat=cat.ID,
...                                             nres=len(cat.array)))
__helio_hec-cactus_stereoa_cme has 4 results
__helio_hec-cactus_stereob_cme has 3 results
__helio_hec-cactus_soho_cme has 7 results

Listing 12: Example of querying the HEC service for multiple CME catalogues, in this
case those detected automatically by the Computer Aided CME Tracking
feature recognition algorithm (CACTus - http://sidc.oma.be/cactus/; Robbrecht
et al. 2009).
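
The catalogue selection in Listing 12 is a two-stage filter on table names. It can be tried offline with a small invented table listing; the `(name, description)` row shape and all entries below are assumptions made for this sketch, not output of the HEC service.

```python
# Invented table listing in the (name, description) shape assumed here.
catalogues = [
    ("cactus_soho_cme", "CACTus CMEs from SOHO"),
    ("cactus_stereoa_cme", "CACTus CMEs from STEREO-A"),
    ("other_soho_cme", "CMEs from another detection method"),
    ("goes_sxr_flare", "soft X-ray flares"),
    ("soho_lasco_cme_list", "manually compiled CME list"),
]
event_type = "cme"

# Stage 1: keep tables for the event type, skipping the '..._list' tables.
matching = [row[0] for row in catalogues
            if event_type in row[0] and "list" not in row[0]]

# Stage 2: keep only the tables produced by CACTus.
cactus_only = [name for name in matching if "cactus" in name]
```

Separating the two stages makes it easy to swap the second condition for a different detection method without touching the event-type filter.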


3.4. Helioviewer

SunPy provides the ability to download images hosted by the Helioviewer Project
(http://wiki.helioviewer.org). The aim of the Helioviewer Project is to enable
the exploration of solar and heliospheric data from multiple data sources (such
as instrumentation and feature/event catalogues) via easy-to-use visual interfaces.
The Helioviewer Project has developed two client applications that allow users to
browse images and create movies of the Sun taken by a variety of instruments:
http://www.helioviewer.org, a Google Maps-like web application, and
http://www.jhelioviewer.org, a movie streaming desktop application. The Helioviewer
Project maintains archives of all its image data in JPEG2000 format (Muller et al. 2009).
The JPEG2000 files are typically highly compressed compared to the source FITS files from
which they are generated, but are still high-fidelity, and thus can be used to quickly
visualise large amounts of data from multiple sources. SunPy is also used in Helioviewer
production servers to manage the download and ingestion of JPEG2000 files from remote
servers.

The Helioviewer Project categorises image data based on the physical construction
of the source instrument, using a simple hierarchy: observatory → instrument →
detector → measurement, where “→” means “provides a”. Each Helioviewer Project
JPEG2000 file contains metadata which are based on the original FITS header
information, and carry sufficient information to permit overlay with other Helioviewer
JPEG2000 files. Images can be accessed either as PNGs (Section 3.4.1) or as JPEG2000
files (Section 3.4.2).


3.4.1. Download a PNG file The Helioviewer API allows composition and overlay of
images from multiple sources, based on the positioning metadata in the source FITS
file. SunPy accesses this overlay/composition capability through the download_png()
method of the Helioviewer client. Listing 13 gives an example of the composition of
three separate image layers into a single image.

>>> from sunpy.net.helioviewer import HelioviewerClient
>>> hv = HelioviewerClient()
>>> hv.download_png("2099/01/01", 6,
...                 "[SDO,AIA,AIA,304,1,100],[SDO,AIA,AIA,193,1,50]," +
...                 "[SOHO,LASCO,C2,white-light,1,100]",
...                 x0=0, y0=0, width=768, height=768)



Listing 13: Acquisition of a PNG image composed from data from three separate sources.

The first argument is the requested time of the image, and Helioviewer selects
images closest to the requested time. In this case, the requested time is in the future
and so Helioviewer will find the most recent available images from each source. The
second argument refers to the image resolution in arcseconds per pixel (larger values
mean lower resolution). The third argument is a comma-delimited string of the three
requested image layers, the details of which are enclosed in square brackets. The image
layers are described using the observatory → instrument → detector → measurement
combination described above, along with two following numbers that denote the visibility
and the opacity of the image layer, respectively (1/0 is visible/invisible, and opacity is
in the range 0 to 100, with 100 meaning fully opaque). The quantities x0 and y0 are
the x and y centre points about which to centre the image (measured in helioprojective
Cartesian coordinates), and the width and height are the pixel values for the image
dimensions.
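
Writing the layer string by hand is error-prone, so it can help to assemble it from structured values. The sketch below is illustrative only: layer_string() is a hypothetical helper, not part of SunPy or the Helioviewer API, and it simply reproduces the bracketed format used in Listing 13 while checking the visibility and opacity ranges stated above.

```python
def layer_string(layers):
    """Build a Helioviewer-style layer string.

    Each layer is a tuple (observatory, instrument, detector, measurement,
    visible, opacity), with visible equal to 0 or 1 and opacity in 0..100.
    """
    parts = []
    for obs, inst, det, meas, visible, opacity in layers:
        if visible not in (0, 1):
            raise ValueError("visible must be 0 or 1")
        if not 0 <= opacity <= 100:
            raise ValueError("opacity must lie between 0 and 100")
        parts.append("[{},{},{},{},{},{}]".format(
            obs, inst, det, meas, visible, opacity))
    return ",".join(parts)


layers = layer_string([
    ("SDO", "AIA", "AIA", 304, 1, 100),
    ("SDO", "AIA", "AIA", 193, 1, 50),
    ("SOHO", "LASCO", "C2", "white-light", 1, 100),
])
print(layers)
```

The resulting string has the same form as the third argument passed to download_png() in Listing 13.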

This functionality makes it simple for SunPy users to generate complex images from 
multiple, correctly overlaid, image data sources. 

3.4.2. Download a JPEG2000 file As noted above, Helioviewer JPEG2000 files contain
metadata that allow positioning of the image data. There is sufficient metadata in each
file to permit the creation of a SunPy Map object (see Section 2.1) from a Helioviewer
JPEG2000 file. This allows image data to be manipulated in the same way as any other
Map object.

Reading JPEG2000 files into a SunPy session requires installing two other pieces
of software. The first, OpenJPEG (http://www.openjpeg.org), is an open-source
library for reading and writing JPEG2000 files. The other package required is Glymur
(https://github.com/quintusdias/glymur), an interface between Python and the
OpenJPEG libraries (note that these packages are not required to use the functionality
described in Section 3.4.1).
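
Because these dependencies are optional, a script may want to check for them before attempting to read JPEG2000 files. The following is a minimal sketch, assuming Python 3 and that the Glymur package is importable under the module name glymur; jpeg2000_support_available() is a hypothetical helper, and it does not verify that the OpenJPEG library itself is installed.

```python
import importlib.util


def jpeg2000_support_available():
    """Return True if a module named 'glymur' can be found for import."""
    return importlib.util.find_spec("glymur") is not None


if jpeg2000_support_available():
    print("Glymur found: JPEG2000 files can be read.")
else:
    print("Glymur not found: only the PNG route (Section 3.4.1) is available.")
```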


Listing 14 demonstrates the querying, downloading, reading and conversion of a
Helioviewer JPEG2000 file into a SunPy Map object. This functionality allows users to
visualise and manipulate Helioviewer-supplied image data in an identical fashion to a
SunPy Map object generated from FITS data (see Section 2.1).


>>> import sunpy.map
>>> filepath = hv.download_jp2("2012/07/05 00:30:00",
...                            observatory='SDO',
...                            instrument='HMI', detector='HMI',
...                            measurement='continuum')
>>> sunpy.map.Map(filepath).submap([200, 550], [-400, -200]).peek()


[Figure: HMI continuum 2012-07-05 00:29:56.800000; horizontal axis: X-position [arcsec].]

Listing 14: Acquisition and display of a Helioviewer JPEG2000 file as a SunPy Map
object. Image values are byte-scaled in the range 0-255.

3.5. The File Database 

Easy access to large quantities of solar data frequently leads to data files accumulating
in local storage such as laptops and desktop computers. Keeping data organised and
available is typically a cumbersome task for the average user. The file database is a
subpackage of SunPy that addresses this problem by providing a unified database to
store and manage information about local data files.

The database subpackage can make use of any database software supported by
SQLAlchemy (http://www.sqlalchemy.org). This library was chosen since it supports
many SQL dialects. If SQLite is selected, the database is stored as a single file, which
is created automatically. A server-based database, on the other hand, could be used by
collaborators who work together on the same data from different computers: a central
database server stores all data and the clients connect to it to read or write data.

The database can store and manage all data that can be read via SunPy’s io
subpackage, and direct integration with the vso module is supported. It is also possible
to manually add file or directory entries. The package also provides a unified data
search via the fetch() method, which includes both local files and files on the VSO.
This reduces the likelihood of downloading the same file multiple times. When a file
is added to the database, the file is scanned for metadata, and a file hash is produced.
The current date is associated with the entry along with metadata summaries such as
instrument, date of observation, field of view, etc. The database also provides the ability
to associate custom metadata with each database entry such as keywords, comments, and
favourite tags, as well as querying the full metadata (e.g., FITS header) of each entry.
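
The role of the file hash in avoiding duplicate entries can be illustrated with the Python standard library alone. This is a sketch under stated assumptions, not SunPy's implementation: the choice of SHA-256, the table layout, and the helper names file_hash() and add_file() are all hypothetical, and sqlite3 stands in for the SQLAlchemy layer.

```python
import hashlib
import os
import sqlite3
import tempfile


def file_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_file(connection, path):
    """Record a file unless an entry with the same hash already exists.

    Returns True if a new entry was added and False for a duplicate.
    """
    cursor = connection.execute(
        "INSERT OR IGNORE INTO entries (hash, path) VALUES (?, ?)",
        (file_hash(path), path))
    connection.commit()
    return cursor.rowcount == 1


# In-memory SQLite database; the date added is filled in automatically.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE entries (hash TEXT PRIMARY KEY, path TEXT, "
    "added TEXT DEFAULT CURRENT_TIMESTAMP)")

# Two temporary files with identical content stand in for a repeated download.
paths = []
for _ in range(2):
    with tempfile.NamedTemporaryFile(delete=False, suffix=".fits") as tmp:
        tmp.write(b"SIMPLE  =                    T")
        paths.append(tmp.name)

added = [add_file(connection, path) for path in paths]
count = connection.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
for path in paths:
    os.remove(path)
print(added, count)
```

Keying the table on the content hash rather than the file name means that a file downloaded twice under different names is still recognised as the same data.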

The Database class connects to a database and allows the user to perform
operations on it. Listing 15 shows how to connect to an in-memory database
and download data from the VSO. These entries are automatically added to the
database. The function len() is used to get the number of records. The function
display_entries() displays an iterable of database entries in a formatted ASCII table.
The column headings correspond to the attributes of the respective database entries.

A useful feature of the database package is the support of undo and redo 
operations. This is particularly convenient in interactive sessions to easily revert 
accidental operations. This feature will also be desirable for a planned GUI frontend 
for this package. 
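
One common way to provide undo and redo is to keep two stacks of performed operations. The sketch below illustrates that general technique on a plain list; the class UndoableList and its methods are hypothetical and are not the SunPy database API, whose internal design is not described here.

```python
class UndoableList:
    """A list whose add/remove operations can be undone and redone."""

    def __init__(self):
        self.entries = []
        self._undo_stack = []
        self._redo_stack = []

    def _apply(self, action, item):
        if action == "add":
            self.entries.append(item)
        else:
            self.entries.remove(item)

    def _do(self, action, item):
        self._apply(action, item)
        self._undo_stack.append((action, item))
        self._redo_stack.clear()  # a new operation invalidates the redo history

    def add(self, item):
        self._do("add", item)

    def remove(self, item):
        self._do("remove", item)

    def undo(self):
        """Revert the most recent operation by applying its inverse."""
        action, item = self._undo_stack.pop()
        inverse = "remove" if action == "add" else "add"
        self._apply(inverse, item)
        self._redo_stack.append((action, item))

    def redo(self):
        """Re-apply the most recently undone operation."""
        action, item = self._redo_stack.pop()
        self._apply(action, item)
        self._undo_stack.append((action, item))


db = UndoableList()
db.add("aia_94.fits")
db.add("aia_335.fits")
db.remove("aia_94.fits")   # accidental removal
db.undo()                  # the removed entry is restored
db.undo()                  # the second add is reverted
db.redo()                  # the second add is re-applied
print(db.entries)
```

In this simplified version a restored entry is appended at the end of the list rather than at its original position.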

>>> from sunpy.net import vso
>>> from sunpy.database import Database
>>> database = Database("sqlite:///")
>>> database.download(
...     vso.attrs.Time("2012-08-05", "2012-08-05 00:00:05"),
...     vso.attrs.Instrument('AIA'))
>>> len(database)
2
>>> from sunpy.database.tables import display_entries
>>> print(display_entries(
...     database,
...     ["id", "observation_time_start", "wavemin", "wavemax"]))
id  observation_time_start  wavemin  wavemax
1   2012-08-05 00:00:01     9.4      9.4
2   2012-08-05 00:00:02     33.5     33.5

Listing 15: Example usage of the database subpackage.


4. Additional Functionality 

SunPy is meant to provide a consistent environment for solar data analysis. In order to 
achieve this goal SunPy provides a number of additional functions and packages which 
are used by the other SunPy modules and are made available to the user. This section 
briefly describes some of these functions. 


4.1. World Coordinate System (WCS) Coordinates


Coordinate transformations are frequently a necessary task within the solar data
analysis workflow. An often used transformation is from observer coordinates (e.g.,
sky coordinates) to a coordinate system that is mapped onto the solar surface (e.g.,
latitude and longitude). This transformation is necessary to compare the true physical
distance between different solar features. This type of transformation is not unique
to solar observations, but is not often considered by astronomical packages such as
the Astropy coordinates package. The wcs package in SunPy implements the World
Coordinate System (WCS) for solar coordinates as described by Thompson (2006). The
transformations currently implemented are some of the most commonly used in solar
data analysis, namely converting from Helioprojective-Cartesian (HPC) to Heliographic
(HG) coordinates. HPC describes the positions on the Sun as angles measured from the
center of the solar disk (usually in arcseconds) using Cartesian coordinates (X, Y). This is
the coordinate system most often defined in solar imaging data (see, for example, images
from SDO/AIA, SOHO/EIT, and TRACE). HG coordinates express positions on the
Sun using longitude and latitude on the solar sphere. There are two standards for this
coordinate system: Stonyhurst-Heliographic, where the origin is at the intersection of the
solar equator and the central meridian as seen from Earth, and Carrington-Heliographic,
which is fixed to the Sun and does not depend on Earth. The implementation of these
transformations passes through a common coordinate system called Heliocentric-Cartesian
(HCC), where positions are expressed in true (de-projected) physical distances instead
of angles on the celestial sphere. These transformations require some knowledge of the
location of the observer, which is usually provided by the image header. In the cases
where it is not provided, the observer is assumed to be at Earth. Listing 16 shows some
examples of coordinate transforms carried out in SunPy using the wcs utilities. This
will form the foundation for transformation functions to be used on Map objects.

>>> from sunpy import wcs
>>> wcs.convert_hg_hpc(10, 53)
(100.49244115330731, 767.97438321917502)
# Convert that position back to heliographic coordinates
>>> wcs.convert_hpc_hg(100.49, 767.97)
(9.9996521808465175, 52.999563684874893)
# Try to convert a position which is not on the Sun to HG
>>> wcs.convert_hpc_hg(-1500, 0)
sunpy/wcs/wcs.py:180: RuntimeWarning: invalid value encountered in sqrt
  distance = q - np.sqrt(distance)
(nan, nan)
# Convert sky coordinate to a position in HCC
>>> wcs.convert_hpc_hcc(-300, 400, z=True)
(-216716967.63331246, 288956420.9477042, 594364636.2208252)

Listing 16: Using the wcs subpackage.
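
The two-step route through HCC described in this section can be written out explicitly. The following is a simplified sketch of the transformations given by Thompson (2006), not the SunPy source: the function names hg_to_hpc() and hpc_to_hg() are hypothetical, the solar radius and observer distance are assumed values, and the observer is assumed to lie in the solar equatorial plane (B0 = 0) unless stated otherwise.

```python
import math

# Assumed constants; the values SunPy uses are not quoted in the text.
R_SUN = 6.95508e8          # solar radius [m]
D_SUN = 1.495978707e11     # observer-Sun distance, taken as 1 AU [m]
ARCSEC = math.radians(1.0 / 3600.0)   # one arcsecond in radians


def hg_to_hpc(lon_deg, lat_deg, b0_deg=0.0, l0_deg=0.0):
    """Heliographic (lon, lat) in degrees -> HPC (x, y) in arcseconds."""
    lon = math.radians(lon_deg - l0_deg)
    lat = math.radians(lat_deg)
    b0 = math.radians(b0_deg)
    # Step 1: HG -> HCC, physical distances in metres.
    x = R_SUN * math.cos(lat) * math.sin(lon)
    y = R_SUN * (math.sin(lat) * math.cos(b0)
                 - math.cos(lat) * math.cos(lon) * math.sin(b0))
    z = R_SUN * (math.sin(lat) * math.sin(b0)
                 + math.cos(lat) * math.cos(lon) * math.cos(b0))
    # Step 2: HCC -> HPC, angles as seen by the observer.
    d = math.sqrt(x ** 2 + y ** 2 + (D_SUN - z) ** 2)
    return math.atan2(x, D_SUN - z) / ARCSEC, math.asin(y / d) / ARCSEC


def hpc_to_hg(x_arcsec, y_arcsec, b0_deg=0.0, l0_deg=0.0):
    """HPC (x, y) in arcseconds -> heliographic (lon, lat) in degrees.

    Returns (nan, nan) when the line of sight misses the solar surface.
    """
    tx, ty = x_arcsec * ARCSEC, y_arcsec * ARCSEC
    b0 = math.radians(b0_deg)
    # Distance from the observer to the near-side surface point.
    q = D_SUN * math.cos(ty) * math.cos(tx)
    discriminant = q ** 2 - D_SUN ** 2 + R_SUN ** 2
    if discriminant < 0:
        return float("nan"), float("nan")
    d = q - math.sqrt(discriminant)
    # Step 1: HPC -> HCC.
    x = d * math.cos(ty) * math.sin(tx)
    y = d * math.sin(ty)
    z = D_SUN - d * math.cos(ty) * math.cos(tx)
    # Step 2: HCC -> HG.
    r = math.sqrt(x ** 2 + y ** 2 + z ** 2)
    lat = math.asin((y * math.cos(b0) + z * math.sin(b0)) / r)
    lon = math.atan2(x, z * math.cos(b0) - y * math.sin(b0))
    return math.degrees(lon) + l0_deg, math.degrees(lat)


x, y = hg_to_hpc(10.0, 53.0)
lon, lat = hpc_to_hg(x, y)
off_disk = hpc_to_hg(-1500.0, 0.0)
print(x, y, lon, lat, off_disk)
```

The negative discriminant for an off-disk position is the same condition that produces the square-root warning and the (nan, nan) result in Listing 16.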

4.2. Solar Constants and Units

Physical quantities (i.e., a number associated with a unit) are an important part of
scientific data analysis. SunPy makes use of the Quantity object provided by the Astropy
units subpackage. This object maintains the relationship between a number and its
unit and makes it easy to convert between units. As these objects inherit from NumPy’s
ndarray, they work well with standard representations of numbers. Using proper
quantities inside of the code base also makes it easier to catch errors in calculations.

SunPy is currently working on integrating quantities throughout the code base. In 
order to encourage the use of units and to enable consistency SunPy provides the sun 
subpackage which includes solar-specific data such as ephemerides and solar constants. 

The main namespace contains a number of functions that provide solar ephemerides
such as the Sun-to-Earth distance, solar-cycle number, mean anomaly, etc. All of these
functions take a time as their input, which can be provided in a format compatible with
sunpy.time.parse_time().

The sunpy.sun.constants module provides a number of solar-related constants
in order to enable the calculation of derived solar values, both within SunPy and by the
user. All solar constants are provided as Constant objects as defined in the Astropy
units package. Each Constant object defines a Quantity, along with the constant’s
provenance (i.e., reference) and its uncertainty. The use of this package is shown in
Listing 17. For convenience, a number of shortcuts to frequently used constants are
provided directly when importing the module. A larger list of constants can be accessed
through an interface modeled on that provided by the SciPy constants package and
is available as a dictionary called physical_constants. To view them all quickly, a
print_all() function is available.


>>> from sunpy.sun import constants
>>> print(constants.mass)
  Name = Solar mass
  Value = 1.9891e+30
  Error = 5e+25
  Units = kg
  Reference = Allen’s Astrophysical Quantities 4th Ed.
# Verify the average density of the Sun and convert to cgs
>>> (constants.mass/constants.volume).cgs
<Quantity 1.40851154227 g / (cm3)>
# Search for the age of the Sun
>>> constants.find("age")
['age', 'average angular size', 'average density', 'average intensity
>>> constants.value('age'), constants.unit('age')
(4600000000.0, Unit("yr"))

Listing 17: Using the sun.constants module.
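
The idea of a constant that carries its unit, uncertainty, and reference, and the density check in Listing 17, can be mimicked with the standard library. This is a hedged sketch only: the Constant tuple below is a hypothetical stand-in rather than the Astropy class, the solar radius is an assumed value, and the unit conversion is done by hand instead of through Quantity objects.

```python
import math
from collections import namedtuple

# Hypothetical stand-in for a constant with provenance and uncertainty.
Constant = namedtuple("Constant", "name value uncertainty unit reference")

# Mass value, uncertainty, and reference as printed in Listing 17.
SOLAR_MASS = Constant("Solar mass", 1.9891e30, 5e25, "kg",
                      "Allen's Astrophysical Quantities 4th Ed.")
# Assumed radius; it is not quoted in the text.
SOLAR_RADIUS = Constant("Solar radius", 6.95508e8, None, "m", "assumed value")


def mean_density_cgs(mass, radius):
    """Mean density in g/cm^3 of a sphere with mass in kg and radius in m."""
    volume_m3 = 4.0 / 3.0 * math.pi * radius.value ** 3
    density_si = mass.value / volume_m3        # kg / m^3
    return density_si * 1e3 / 1e6              # kg -> g and m^3 -> cm^3


density = mean_density_cgs(SOLAR_MASS, SOLAR_RADIUS)
print("{:.2f} g / cm3".format(density))
```

With these assumed inputs the result is close to, but not identical with, the value in Listing 17, because the volume constant used there is not reproduced here. Doing the conversion by hand also shows why unit-aware quantities are attractive: the factors of 1e3 and 1e6 are easy to get wrong.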


4.3. Instruments

In addition to providing support for instrument-specific solar data via the main data
classes Map, LightCurve, and Spectrum, some instrument-specific functions may be
found within the instr subpackage. These functions are generally those that are unique
to one particular solar instrument, rather than of general use, such as a function to
construct a GOES flare event list or a function to query the LYRA timeline annotation
file. Currently, some support is included for the GOES, LYRA, RHESSI and IRIS
instruments, while future developments will include support for additional missions.
Ultimately, it is anticipated that solar missions requiring a large suite of software tools
will each be supported via a separately maintained package that is affiliated with SunPy.

SunPy is a community-developed library, designed and developed for and by the solar
physics community. Not only is all the source code publicly available online under
the permissive 2-clause BSD licence, the whole development process is also online and
open for anyone to contribute to. SunPy’s development makes use of the online service
GitHub (http://github.com) and Git‡ as its distributed version control software.

The continued success of an open-source project depends on many factors; three of 
the most important are (1) utility and quality of the code, (2) documentation, and (3) an 
active community (Bangerth and Heister, 2013). Several tools, some specific to Python, 
are used by SunPy to make achieving these goals more accessible. To maintain high- 
quality code, a transparent and collaborative development workflow made possible by 
GitHub is used. The following conditions typically must be met before code is accepted. 


(i) The code must follow the PEP 8 Python style guidelines
(http://www.python.org/dev/peps/pep-0008/) to maintain consistency in the SunPy code.

(ii) All new features require documentation in the form of docstrings as well as user
guides.

(iii) The code must contain unit tests to verify that the code is behaving as expected. 

(iv) Community consensus is reached that the new code is valuable and appropriately 
implemented. 


This kind of development model is widely used within the scientific Python community 
as well as by a wide variety of other projects, both open and closed source. 
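
As an illustration of conditions (i) to (iii), a small contribution might look like the sketch below. It is hypothetical throughout: arcsec_to_degrees() is not a SunPy function, and the standard-library unittest module is used here only to keep the example self-contained; the test framework actually used by the project is not stated in this section.

```python
import unittest


def arcsec_to_degrees(angle_arcsec):
    """Convert an angle from arcseconds to degrees.

    Parameters
    ----------
    angle_arcsec : float
        Angle in arcseconds.

    Returns
    -------
    float
        The same angle in degrees.
    """
    return angle_arcsec / 3600.0


class TestArcsecToDegrees(unittest.TestCase):
    """Unit tests verifying the documented behaviour."""

    def test_one_degree(self):
        self.assertAlmostEqual(arcsec_to_degrees(3600.0), 1.0)

    def test_sign_is_preserved(self):
        self.assertAlmostEqual(arcsec_to_degrees(-1800.0), -0.5)


# Run the tests programmatically so the interpreter is not exited.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestArcsecToDegrees)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```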

Additionally, SunPy makes use of ‘continuous integration’ provided by Travis
CI (http://travis-ci.org), a process by which the addition of any new code
automatically triggers a comprehensive check of the code functionality, which is
maintained as a set of unit tests. If any single test fails, the community is alerted
before the code is accepted. The unit-test coverage is monitored by a service called
Coveralls (http://coveralls.io).

High-quality documentation is one of the most important factors determining the
success of any software project. Powerful tools already exist in Python to support
documentation, thanks to Python’s native focus on its own documentation. SunPy
makes use of the Sphinx (http://sphinx-doc.org) documentation generator. Sphinx
uses reStructuredText as its markup language, which is an easy-to-read,
what-you-see-is-what-you-get plaintext markup syntax. It supports many output formats,
most notably HTML, as well as PDF and ePub, and provides a rich, hierarchically
structured view of in-code documentation strings. The SunPy documentation is built
automatically and is hosted by Read the Docs (http://readthedocs.org) at
http://docs.sunpy.org.

Communication is the key to maintaining an active community, and the SunPy
community uses a number of different tools to facilitate communication. For immediate
communications, an active IRC chat room (#SunPy) is hosted on freenode.net. For
more involved or less immediate needs, such as developer comments or discussions,
an open mailing list is hosted by Google Groups. Bug tracking, code reviews, and
feature-request discussions take place directly on GitHub. The SunPy community also
reaches out to the wider solar physics community through presentations, functionality
demonstrations, and informal meetups at scientific meetings.

‡ For more information see http://git-scm.com/

In order to enable the long-term development of SunPy, a formal organizational
structure has been defined. The management of SunPy is the responsibility of the SunPy
board, a group of elected members of the community. The board elects a lead developer
who is responsible for the day-to-day development of SunPy. SunPy also makes
use of Python-style enhancement proposals, which can be proposed by the community
and are voted on by the board. These proposals set the overall direction of SunPy’s
development.


6. Future of SunPy 


Over the three years of SunPy’s development, the code base has grown to over 17,000 
lines. SunPy is already a useful package for the analysis of calibrated solar data, and it 
continues to gain significant new capabilities with each successive release. The primary 
focus of the SunPy library is the analysis and visualisation of ‘high-level’ solar data. 
This means data that has been put through instrument processing and calibration 
routines, and contains valid metadata. The plan for SunPy is to continue development 
within this scope. The primary components of this plan are to provide a set of data 
types that are interchangeable with one another: e.g., if you slice a MapCube along 
one spatial location, a LightCurve of intensity along the time range of the MapCube 
should be returned. To achieve this goal, all the data types need to share a unified 
coordinate system architecture so that each data type is aware of what the physical type 
of its data is and how operations on that data should be performed. This will enable 
useful operations such as the coordinate and solar-rotation-aware overplotting of HELIO 
(Section 3.3) and HEK results (Section 3.2) onto maps (Section 2.1). Finally, support
for new data providers and services will be integrated into SunPy. For example, new 
HELIO services will be supported by SunPy, aiming for seamless interaction between 
the other services and tools available (e.g., hek, map). 
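
The interchangeability goal stated above, where slicing a MapCube at one spatial location yields a light curve, can be pictured with plain Python containers. The sketch below is purely illustrative and does not use or anticipate any SunPy interface: the cube is a hypothetical list of (time, image) pairs with made-up intensities, and light_curve_at() is a hypothetical helper.

```python
# Hypothetical stand-in for a MapCube: a time-ordered list of (time, image)
# pairs, where each image is a list of rows of pixel intensities.
cube = [
    ("2012-08-05T00:00:00", [[1, 2], [3, 4]]),
    ("2012-08-05T00:00:12", [[2, 3], [4, 5]]),
    ("2012-08-05T00:00:24", [[4, 6], [8, 9]]),
]


def light_curve_at(frames, row, column):
    """Return (times, intensities) for one pixel across all frames."""
    times = [time for time, _ in frames]
    intensities = [image[row][column] for _, image in frames]
    return times, intensities


times, intensities = light_curve_at(cube, 1, 0)
print(list(zip(times, intensities)))
```

A coordinate-aware implementation would select the location in physical coordinates rather than by pixel index, which is why the shared coordinate architecture described above is needed.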

In concert with the work on the data types, further integration with the Astropy
package will enable SunPy to incorporate many new features with little effort.
Collaboration and joint development with the Astropy project (Astropy Collaboration
et al. 2013) is ongoing.


7. Summary 


We have presented the release of SunPy (v0.5), a Python package for solar physics. In
this paper we have described the main functionality, which includes the SunPy data
types Map (see Section 2.1), LightCurve (see Section 2.2), and Spectrogram (see
Section 2.3). We have described the data and event catalogue retrieval capabilities
of SunPy for the Virtual Solar Observatory (see Section 3.1), the Heliophysics Event
Knowledgebase (see Section 3.2), as well as the Heliophysics Integrated Observatory
(see Section 3.3). We described a new organization tool for data files integrated into
SunPy (see Section 3.5) and we discussed the community aspects, development model
(see Section 5), and future plans (see Section 6) for the project. We invite members of
the community to contribute to the effort by using SunPy for their research, reporting
bugs, and sharing new functionality with the project.